CSCI 520                                                           Information Structures                                                          Fall 2008

Dr. Creider

Lab SP (Semester Project) Assignment

Comparison of Algorithms to Find the Mode in an Ordered Array

 

Write a program to compare the execution speed of different versions of an algorithm or different algorithms to find the mode in a list (array).  For lists, the mode is the most common (frequent) value.  A list can have more than one mode.  For histograms, a mode is a relative maximum ("bump").  A data set has no mode when all the numbers appear in the data with the same frequency. A data set has multiple modes when two or more values appear with the same frequency.  The data in the array will be ordered and will consist of at least 5 million numbers in the range of signed long.  You will continue examining different algorithms or modifications until you are satisfied that you found the fastest method to solve the problem. 

 

Use the following guidelines to complete this assignment:

 

Write a function and any other code necessary to return the mode in the parameter list if found.  You can write as many functions as you need to solve this problem, however, you must have at least one function and any additional functions, if any, must be called from the ‘controlling function’.  Return a 1 if a unique mode was found and assign the mode value to the last argument.  Return a zero if there is no mode because all the values occurred with the same frequency, or return a 2 if the mode is not unique.  The first line of the function definition is as follows:

 

int mode (long *data, long data_size, long &mode)

 

You cannot permanently change the contents of the array in which you are checking for duplicates.  You can use any algorithm to solve the problem including algorithms from 515.  If you create any structures, including additional arrays, or write additional functions, the structure definitions and prototypes must be placed at the beginning of the ‘controlling function’.  Any additional memory allocated in the ‘controlling function’ must be deleted before the function terminates.

 

You must keep a record of each algorithm or modification of an algorithm that you timed.  The simplest method may be to number them beginning with 1.  You will record in a written document each algorithm or modification to an algorithm that you timed including the actual timing value.  You must briefly describe the algorithm besides including the actual code.  The timing code is available on the ‘GA’ computer in the 101/102 lab along with 3 data sets of 10 million values each that provide for the 3 values to be returned from the function.  Your written report (done in Word 2003) must include your name, class id, campus wide id and the semester in which the project was completed.  The report must also include a graph (no pie chart) showing the difference in the time for each algorithm.  Do not include the timing code in the written document; just mention which method you used, C++ method or the Windows method.

 

You will be graded on the completeness of your written document and the fastest time that you were able to achieve. 

 

Assignment is due December 1, 2008.  Upload the assignment before or at least on the due date.